[WIP] Make AWQ more general #2400

jerryzh168 · 2025-06-18T04:26:00Z

Summary:

Added AWQConfig that takes a base config and made corresponding changes in other parts of the flow

Test Plan:

# Produce model
# make sure to change the model_save_path
python torchao/prototype/awq/example2.py --repo "Qwen/Qwen3-4B" --quant awq-8da4w-128 --tasks bhh --model_save_hf_hub_path jerryzh168/Qwen3-4B-8da4w-awq


# eval
lm_eval --model hf --model_args pretrained=jerryzh168/Qwen3-4B-8da4w-awq --tasks bhh --device cuda:0 --batch_size auto

Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot · 2025-06-18T04:26:03Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2400

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Unrelated Failure

As of commit 4d7eeb7 with merge base 378e179:

NEW FAILURE - The following job has failed:

Code Analysis with Ruff / build (3.9) (gh)
Process completed with exit code 1.

BROKEN TRUNK - The following job failed but was already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Run Regression Tests / test-nightly (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch --index-url https://downloa... / linux-job (gh) (trunk failure)
##[error]The operation was canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

kimishpatel · 2025-06-18T19:36:17Z

torchao/prototype/awq/api.py

+                eps=eps,
+            )
+        else:
+            observer = AWQObserver2(


Can't you add kwargs to AWQObserver and just check for 'base_config' in kwargs?

jerryzh168 · 2025-06-18

Yes, this is temporary. I think we can deprecate the old one in the end.

kimishpatel · 2025-06-18T19:36:58Z

torchao/prototype/awq/api.py

+
+
+@dataclass
+class AWQConfig(AOBaseConfig):


OK, so this is consolidating with the quantize_ API's config-based design?
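For readers following the thread, here is a minimal sketch of what "a config that takes a base config" can look like in a config-based design. It is not the torchao implementation: `BaseConfigSketch`, `Int4WeightOnlySketch`, `AWQConfigSketch`, and the handler registry are all hypothetical stand-ins, and weights are replaced by strings so that only the dispatch pattern is visible.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Type


class BaseConfigSketch:
    """Hypothetical stand-in for a base config class (not torchao's AOBaseConfig)."""


@dataclass
class Int4WeightOnlySketch(BaseConfigSketch):
    """Hypothetical base quantization config."""
    group_size: int = 128


@dataclass
class AWQConfigSketch(BaseConfigSketch):
    """Hypothetical AWQ config that wraps another config."""
    base_config: BaseConfigSketch


# Registry mapping a config type to the function that applies it (assumption).
_HANDLERS: Dict[Type[BaseConfigSketch], Callable] = {}


def register_handler(config_type):
    """Decorator that records the handler for a given config type."""
    def decorator(fn):
        _HANDLERS[config_type] = fn
        return fn
    return decorator


@register_handler(Int4WeightOnlySketch)
def _int4_handler(weight, config):
    # A real handler would return a quantized tensor; a label is enough here.
    return f"int4(group_size={config.group_size})"


@register_handler(AWQConfigSketch)
def _awq_handler(weight, config):
    # AWQ-specific work would happen here, then the base handler is looked up
    # from the type of the wrapped config and applied.
    base_handler = _HANDLERS[type(config.base_config)]
    return "awq+" + base_handler(weight, config.base_config)


def apply_config(weight, config):
    """Dispatch on the config type, as a config-based API would."""
    return _HANDLERS[type(config)](weight, config)


result = apply_config(None, AWQConfigSketch(Int4WeightOnlySketch(group_size=64)))
```

The point of the pattern is that the AWQ handler does not need to know which quantization scheme it wraps; it only forwards to whatever handler is registered for the base config.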

kimishpatel · 2025-06-18T20:03:35Z

torchao/prototype/awq/api.py

+    dummy_mod = DummyModule(observed_linear.weight * equalization_scale)
+    quant_mod = base_config_handler(dummy_mod, config.base_config)


I am not sure what's happening here. Isn't module already an nn.Module?

jerryzh168 · 2025-06-19

This is just trying to quantize the weight with the quantization type specified by config.base_config.
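A rough sketch of the idea in this thread, assuming nothing about the real torchao classes: the scaled weight is wrapped in a throwaway module so that an existing handler, which expects an object with a `weight` attribute, can quantize it. Plain Python lists stand in for tensors, `toy_base_handler` is a hypothetical per-row round-to-nearest quantizer, and the activation is divided by the same scale so the unquantized product is unchanged.

```python
class DummyModule:
    """Throwaway container exposing only a `weight` attribute (stand-in)."""

    def __init__(self, weight):
        self.weight = weight


def scale_columns(weight, scale):
    """Multiply input channel j of every output row by scale[j]."""
    return [[w * s for w, s in zip(row, scale)] for row in weight]


def toy_base_handler(module, n_bits=4):
    """Hypothetical base handler: per-row symmetric round-to-nearest."""
    qmax = 2 ** (n_bits - 1) - 1
    quantized = []
    for row in module.weight:
        # Fall back to 1.0 so an all-zero row does not divide by zero.
        step = (max(abs(w) for w in row) or 1.0) / qmax
        quantized.append([round(w / step) * step for w in row])
    module.weight = quantized
    return module


def linear(weight, x):
    """y[i] = sum_j weight[i][j] * x[j]."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


weight = [[1.0, -2.0], [0.5, 4.0]]   # shape (out_features, in_features)
equalization_scale = [2.0, 0.5]      # one scale per input channel
x = [3.0, -1.0]

# Scale the weight up and the activation down by the same factor.
scaled = scale_columns(weight, equalization_scale)
x_scaled = [xi / s for xi, s in zip(x, equalization_scale)]

# Quantize the scaled weight through the dummy wrapper.
quant_mod = toy_base_handler(DummyModule(scaled))
```

Before quantization, `linear(scaled, x_scaled)` matches `linear(weight, x)`, because each scale factor cancels within its input channel. Only the rounding inside the handler introduces error.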

kimishpatel · 2025-06-18T20:04:40Z

torchao/prototype/awq/api.py

+    if config.set_inductor_config:
+        torchao.quantization.utils.recommended_inductor_config_setter()
+
+    observed_linear = module


If this is for linear only, should you not assert that this is an nn.Linear? Plus, how do you make sure this function is called only on nn.Linear?

jerryzh168 · 2025-06-19

Yeah, that's true, I will add an assert. We rely on the user to use quantize_ correctly (by specifying the filter_fn arg in the quantize_ API).

ao/torchao/quantization/quant_api.py

Line 578 in 4e3d019

filter_fn: Optional[Callable[[torch.nn.Module, str], bool]] = None,
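To illustrate the two safeguards discussed above (a filter on the caller side and an assert inside the transform), here is a small sketch that uses no torch at all. `Module`, `Linear`, `LayerNorm`, `awq_transform`, and `quantize_sketch` are hypothetical stand-ins. Only the `(module, fully_qualified_name) -> bool` shape of `filter_fn` is taken from the signature quoted above, and defaulting to `is_linear` is a choice made for this sketch.

```python
from typing import Callable, Optional


class Module:
    """Hypothetical stand-in for torch.nn.Module (not the real class)."""

    def __init__(self, **children):
        self.children = dict(children)


class Linear(Module):
    """Hypothetical stand-in for torch.nn.Linear."""


class LayerNorm(Module):
    """Hypothetical stand-in for a non-linear-layer module."""


def is_linear(module: Module, fqn: str) -> bool:
    """Filter with the (module, name) -> bool shape quoted in the thread."""
    return isinstance(module, Linear)


def awq_transform(module: Module):
    """Transform that guards against being called on the wrong module type."""
    assert isinstance(module, Linear), (
        f"AWQ transform expects Linear, got {type(module).__name__}"
    )
    return ("awq", module)


def quantize_sketch(
    model: Module,
    transform: Callable,
    filter_fn: Optional[Callable[[Module, str], bool]] = None,
    prefix: str = "",
) -> None:
    """Replace every child accepted by filter_fn; recurse into the rest."""
    filter_fn = filter_fn or is_linear
    for name, child in list(model.children.items()):
        fqn = f"{prefix}.{name}" if prefix else name
        if filter_fn(child, fqn):
            model.children[name] = transform(child)
        else:
            quantize_sketch(child, transform, filter_fn, fqn)


model = Module(proj=Linear(), norm=LayerNorm(), block=Module(fc=Linear()))
quantize_sketch(model, awq_transform, is_linear)
```

With both checks in place, a permissive filter such as `lambda m, fqn: True` fails loudly at the assert instead of silently transforming a module the handler was never written for.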


facebook-github-bot added the CLA Signed label Jun 18, 2025

jerryzh168 mentioned this pull request Jun 18, 2025

[WIP] Add AWQ quantization with QDQLayout support for ExecuTorch #2399

Open

kimishpatel reviewed Jun 18, 2025


jerryzh168 force-pushed the refactor-awq branch from d682cb5 to 8b1fca1 on June 24, 2025 22:42

jerryzh168 added the topic: improvement label Jun 24, 2025

[WIP] Make AWQ more general

4d7eeb7

Summary: Added AWQConfig that takes a base config and made corresponding changes in other parts of the flow

Test Plan: TODO

Reviewers: Subscribers: Tasks: Tags:

jerryzh168 force-pushed the refactor-awq branch from 8b1fca1 to 4d7eeb7 on July 12, 2025 23:02
